I INTRODUCTION 



There is a need for Proof and Test to provide users of NUWES data with 
estimates of actual vehicle positions as a function of time and also to 
provide some indication of the quality of those estimates (that is, to provide 
some indication of how close those estimates are to the actual positions). 
There is also a need to provide Instrumentation with feedback on the 
capability of the position location system for providing data satisfying the 
needs of the users. The purpose of this report is to propose a measure of 
quality which will be called a Figure of Merit to satisfy those needs. 

The proposed Figure of Merit (FM) is based on the statistical concept of 
a confidence interval. It is a numerical value for a statistical bound on the 
difference between the estimated coordinate value (xe) and the actual or true 
value (xt). In lay terms, the values of FM provide readily understandable 
indicators of the quality of the estimates and hence of the data used to 
establish them. 

Application of the proposed FM is illustrated for a 40 point segment of 
NUWES data recorded on a trial run involving two vehicles. The data segment 
presents a fairly wide range of difficulty for both the position location 
system and the smoothing process. 

II THE PROPOSED FIGURE OF MERIT 

The proposed FM incorporates the occurrence of missing and outlier points 
as well as the magnitude of scatter or noise in the data segment to establish 
a measure of the quality of the data segment used to produce the estimate xe 
at the observation time in the center of the segment. A 7-point Least-Squares 
Polynomial model (Ref. 4) is used in the smoothing process to determine values 
for the estimated values (the xe's) and will also provide a basis for establish- 
ing values for the FM's to be associated with the xe's. Each of the factors 
contributing to the FM's is described below. 

1. Missing data points in a data segment degrade the quality of the 
estimates by decreasing the number of legitimate observations in the 7-point 
segments used to establish the estimates. The smoothing process requires that 
temporary values be provided at the missing points so that a sequence of seven 
consecutive observations is available for the smoothing. These temporary 
values are produced by linear averaging of the adjacent legitimate 
observations. Smoothing of these temporary values is repeated until the 
residual error (the difference between the presmoothed values and the smoothed 
values at the smoothing point) is within acceptable bounds. This bound has 
been set at unity (1) so that the residual error is well within the noise 
level present in good quality data and hence does not contaminate the 
information provided by the legitimate observations in the data segment to a 
serious extent. 

2. Outlier data points are observations that are inconsistent with 
neighboring values. These are identified by using sequential differences 
(Ref. 2), with any observation having a fourth order difference exceeding 50 in 
magnitude being identified as a potential outlier. 

Since outliers also contaminate the fourth order differences of adjacent 
observations but to a lesser extent, the largest fourth order difference 
exceeding the threshold is identified as the only outlier. After smoothing 
this outlier, sequential differences should be recalculated to determine 
whether other outliers occur in a data segment. As with missing points, the 
smoothing process should be iterated (repeated) until the residual error at 
that point is negligible. 

The number of legitimate observations in a 7-point data segment to be 
used for estimating the value xe is 

NS = 7 - M - W 

where M is the number of missing points and W is the number of outliers or 
wild values in the segment. 

3. A measure of the scatter in a data segment is obtained in the 
smoothing process. This is the standard deviation SDRK of the residual errors 
of the data segment when a Kth order polynomial is fitted to the segment. SDRK 
is determined by finding the sum of the squares of the residual errors (SSRK) 
and the appropriate degrees of freedom DFK where 

      NS - 2   for K = 1, 

DFK = NS - 3   for K = 2, 

      NS - 4   for K = 3. 

(Note that 2,3,4 are the number of parameters in polynomials of order 1,2,3 
and are called the degrees of freedom lost when polynomials of those orders 
are fitted to the data segment.) SDRK is defined as 

SDRK = SQR(SSRK/DFK). 

It can be established that 

SSR1 >= SSR2 >= SSR3 >= ... . 

On the other hand, 

DF1 > DF2 > DF3 > ... . 

Thus it is possible, for example, for a second order polynomial (K=2) to have 
a smaller standard deviation than a third order polynomial (K=3). In previous 
work on this project, selection of the appropriate order polynomial for fitting 
a data segment was made by comparison of the values of the SDRK's. Note that 
the SSRK's represent the quality of mathematical fit of the polynomials to the 
data segment whereas the SDRK's represent the quality of statistical fit. 
Subsequently, it will be shown that the values of the FM's (yet to be defined) 
provide an even better basis for selection of the polynomial order to be used 
to fit a data segment. 
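
The quantities SSRK, DFK and SDRK can be illustrated with a short Python sketch. It is not the TI-59 program used for this report; the function names (polyfit_ls, fit_stats) are hypothetical, the fit is an ordinary least-squares solve of the normal equations, and all seven points are assumed to be legitimate (NS = 7). The sample values are the xo entries of Table 1 for t=2121 to t=2127, with time measured from the center of the segment so that the constant coefficient is the estimate xe.

```python
import math

def polyfit_ls(x, y, k):
    """Least-squares coefficients c[0..k] of a polynomial of order k."""
    n = k + 1
    # Augmented normal equations [X'X | X'y].
    a = [[float(sum(xi ** (i + j) for xi in x)) for j in range(n)]
         + [sum(yi * xi ** i for xi, yi in zip(x, y))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    # Back substitution.
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = a[i][n] - sum(a[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = s / a[i][i]
    return coef

def fit_stats(x, y, k):
    """Return (xe at x=0, SSRK, DFK, SDRK) for an order-k fit, assuming NS = len(x)."""
    coef = polyfit_ls(x, y, k)
    fitted = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    df = len(x) - (k + 1)          # parameters lost: 2, 3, 4 for K = 1, 2, 3
    sdr = math.sqrt(ssr / df)
    return coef[0], ssr, df, sdr

# xo values for t = 2121 .. 2127 (Table 1); x is time relative to t = 2124.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [33664.6, 33695.0, 33718.5, 33745.8, 33767.1, 33780.7, 33798.0]
for k in (1, 2, 3):
    xe, ssr, df, sdr = fit_stats(x, y, k)
    print(f"K={k}  xe={xe:.1f}  SSR={ssr:.2f}  DF={df}  SDR={sdr:.2f}")
```

For this segment the sums of squares decrease with K as stated above, while the standard deviation for K=2 comes out smaller than that for K=3, which is the situation described in the text.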

Attention should be directed to the role of the DFK's in establishing the 
SDRK's. If DFK is less than unity (DFK < 1) for any K, then a polynomial of 
order K cannot be fitted to the data segment using the L-S Method. For 
example, if there are 3 missing and outlier points in a 7-point data segment, 
then NS = 4 and a polynomial of order K=3 would have DF3 = 0 as its degrees of 
freedom. (A polynomial of order three can, at least theoretically, be fitted 
exactly to the four legitimate observations in the segment, but the noise in 
those observations is also included in the fitting and no estimate of the 
magnitude of the noise can be made.) One other point should be stressed here. 
Suppose that a cubic polynomial is appropriate for the vehicular path but that 
DF3 < 1. The values of SDR1 and SDR2 then include not only the effects of scatter 
but also components due to the inappropriateness of the model. 

Instrumentation should be aware of the fact that the SDR's do not represent 
only scatter but can contain model error components. 
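
The bookkeeping for NS and DFK is simple enough to state as a few lines of Python. This is only a sketch; the function names are hypothetical, and the split of the "3 missing and outlier points" into two missing and one wild value is chosen purely for illustration.

```python
def legitimate_count(missing, wild, window=7):
    """NS = 7 - M - W for a 7-point segment."""
    return window - missing - wild

def degrees_of_freedom(ns, k):
    """DFK = NS - (K + 1): a polynomial of order K has K + 1 parameters."""
    return ns - (k + 1)

def can_fit(ns, k):
    """An order-K fit yields a noise estimate only when DFK >= 1."""
    return degrees_of_freedom(ns, k) >= 1

ns = legitimate_count(missing=2, wild=1)      # example from the text: NS = 4
for k in (1, 2, 3):
    print(k, degrees_of_freedom(ns, k), can_fit(ns, k))
```

With NS = 4 the cubic has DF3 = 0 and is rejected, while orders 1 and 2 remain available.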

A final factor needs discussion before confidence intervals and FM's can 
be presented. Establishment of a confidence interval requires knowledge of a 
specific statistical distribution. In this application, appropriate assumptions 
lead to the Student T distribution for the residual errors. An extract of a 
table for this distribution (Ref. 7) is presented below for a confidence level 
of 0.95. 

DFK    :      0       1       2       3       4       5 

T(DFK) : 99.999   6.314   2.920   2.353   2.132   2.015 

The value of T(0) should be infinity to indicate no confidence in the estimate. 
The value 99.999 has been introduced somewhat arbitrarily for computational 
convenience and will result in a wide confidence interval and a large value for 
FM. 

At last the background is set for the introduction of confidence intervals 
and the definition of the proposed Figure of Merit FM. A confidence interval 
for the true positional coordinate at any time T is specified by its two 
endpoints, i.e., 

CI(xt) = (xe - T(DFK)*SDRK/SQR(NS), xe + T(DFK)*SDRK/SQR(NS)). 

Again in lay terms, this expression indicates that the actual value of the 
position coordinate can be expected to be in this interval centered at the 
estimated value xe about 95% of the time. 

The proposed measure of quality is the statistical bound 

FMK = T(DFK)*SDRK/SQR(NS) 

for the difference between the actual value and the estimated value of a 
position coordinate. This difference can be expected to be less than FMK about 
95% of the times that such bounds are calculated. Large values of FMK indicate 
low confidence that xe is close to xt and small values of FMK indicate that xt 
should differ only slightly from xe. 

It would appear reasonable to shift the basis for selecting the appropriate 
order of the polynomial to be used to fit a data segment to that K which 
produces the smallest bound for the difference between xe and xt. Thus, the 
proposed figure of merit is 

FM = min(FMK). 
      K 

Also, for this specific K, 

DF = DFK 

and 

SDR = SDRK. 
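
The selection rule FM = min(FMK) can be sketched in Python as follows. The function names are hypothetical, the T values are copied from the extract above as given, and orders with DFK < 1 are simply skipped, which is one reading of the remarks on DFK and not necessarily what the original program does. The SSRK values in the example are illustrative numbers, not results from this report.

```python
import math

# T(DFK) at the 0.95 level, copied from the extract of the table (Ref. 7).
T95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015}

def fm_for_order(ssr, ns, k):
    """Return (FMK, DFK, SDRK), or None when DFK < 1."""
    df = ns - (k + 1)
    if df < 1:
        return None
    sdr = math.sqrt(ssr / df)
    return T95[df] * sdr / math.sqrt(ns), df, sdr

def figure_of_merit(ssr_by_k, ns):
    """Pick the order K with the smallest FMK; returns (K, FM, DF, SDR)."""
    best = None
    for k in sorted(ssr_by_k):
        result = fm_for_order(ssr_by_k[k], ns, k)
        if result is None:
            continue
        fm, df, sdr = result
        if best is None or fm < best[1]:
            best = (k, fm, df, sdr)
    return best

def confidence_interval(xe, fm):
    """CI(xt) = (xe - FM, xe + FM)."""
    return xe - fm, xe + fm

# Hypothetical sums of squares for K = 1, 2, 3 on a full segment (NS = 7).
k, fm, df, sdr = figure_of_merit({1: 208.4, 2: 16.35, 3: 16.21}, ns=7)
print(k, round(fm, 2), df, round(sdr, 2))
print(confidence_interval(33452.2, 3.94))
```

Returning DF and SDR together with FM mirrors the definitions DF = DFK and SDR = SDRK for the selected K.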



III APPLICATION 



Calculation of FM's will be illustrated for a specific sample of NUWES data 
(investigator's sample 2.1AX). 

The sample data selected includes 40 observation times (t=2121 to t=2160). 
In order to smooth and establish an estimate xe and an FM at each of these 
times, additional points were needed. Data for times t=2117 to t=2163 are 
presented in the first three columns of Table 1 and plotted in Figure 1. The 
first column contains the observation times, T. The second column gives the 
identity of the position location array providing the observation with -2 
denoting that no observation is available. The third column contains the 
observed values (the xo's). Note that there are 6 times with missing 
observations that have been filled by temporary values as mentioned in Section 
II and to be discussed later. 

In order to make the procedure clear, each step is described in some detail 
below. 

STEP 1 Establish temporary values for the x coordinate (xo) at the missing 
times. This was accomplished by using linear interpolation between adjacent 
observed values. For example, the temporary value at time t=2120 was 
supplied by taking the average of the xo values at times t=2119 and t=2121. 
Missing points are identified as Questionable values (QP) in column 4 of Table 
1. Values of -1 and -2 are used to indicate Unscheduled and Scheduled missing 
points, respectively. They are also identified by circles in Figure 1. 
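
Step 1 can be sketched in Python as below. The function name fill_missing is hypothetical, missing observations are represented by None, and the sketch assumes that each missing point is isolated, with a legitimate observation on either side, as is the case in this sample.

```python
def fill_missing(values):
    """Replace each isolated None by the average of its two neighbors."""
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            if i == 0 or i == len(values) - 1:
                raise ValueError("missing point at the end of the data")
            left, right = values[i - 1], values[i + 1]
            if left is None or right is None:
                raise ValueError("adjacent missing points are not handled")
            filled[i] = (left + right) / 2.0
    return filled

# xo at t = 2119, 2120 (missing), 2121
print(fill_missing([33603.6, None, 33664.6]))
```

For the values at t=2119 and t=2121 this gives the temporary value 33634.1 listed in Table 1 for t=2120.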

TABLE 1 

Sample NWS2A1X 

    T  ARRAY        xo       QP   K   DF         xe       R        FM 

 2117      3     33533.2      0
 2118      3     33567.7      0
 2119      3     33603.6      0
 2120     -2    (33634.1)    -2   2    3    33634.9    (-0.8)    1.51
 2121      3     33664.6      0   2    3    33665.3     -0.7     1.58
 2122      3     33695.0      0   2    3    33693.5      1.5     1.47
 2123      3     33718.5      0   3    2    33720.9     -2.4     2.20
 2124      3     33745.8      0   2    4    33744.6      1.2     2.02
 2125      3     33767.1      0   2    3    33764.6      2.5     3.14
 2126      3     33780.7      0   2    3    33783.7     -3.0     3.26
 2127      3     33798.0      0   3    2    33799.7     -1.7     3.13
 2128     -2    (33810.9)    -2   3    2    33816.3     (5.4)    4.34
 2129      3     33823.7      0   3    2    33828.1     -4.7     4.01
 2130      3     33827.3      0   3    2    33821.6      5.7     7.02
 2131      3     33794.2      0   2    3    33786.6      7.6    11.69
 2132      3     33726.1      0   3    3    33724.0      2.1     2.84
 2133      3     33637.7      0   3    2    33640.1     -2.4     2.20
 2134     12     33556.5      0   3    2    33556.9     -0.4     4.27
 2135     12     33486.6      0   3    2    33489.9     -3.3     2.88
 2136     -2    (33466.5)   -12   3    2    33452.2    (14.3)    3.94
 2137      3     33446.5      0   3    2    33451.1     -4.6     5.45
 2138      3     33485.1      0   3    2    33489.8     -4.7     7.40
 2139      3     33559.3      0   3    2    33560.9     -1.6     2.82
 2140      3     33650.2      0   3    3    33649.8      0.4     1.95
 2141      3     33738.5      0   3    2    33734.5      4.0     4.63
 2142      3     33799.0      0   3    2    33795.6      3.4     6.25
 2143      3     33825.8      0   2    3    33823.1      2.7     2.33
 2144     -2    (33798.7)   -12   3    2    33813.0    (14.3)    2.41
 2145      3     33771.6      0   3    2    33767.4      4.2     5.58
 2146      3     33698.3      0   3    2    33695.6      3.7     4.33
 2147      3     33607.7      0   3    2    33614.7     -7.0     7.84
 2148     12     33528.5      0   3    3    33529.8     -1.3     4.04
 2149     12     33455.2      0   3    3    33451.7      3.5     3.67
 2150      3     33381.2      0   3    2    33383.4     -2.2     2.33
 2151      3     33323.5      0   2    3    33325.2     -1.7     3.42
 2152     -2    (33285.1)    -2   2    3    33277.9     (7.2)    2.56
 2153      3     33246.6      0   3    2    33243.5      3.1     2.75
 2154      3     33219.5      0   2    2    33222.1     -2.6     3.58
 2155      3     33212.5      0   2    2    33214.0     -1.5     3.46
 2156      3     33221.0      0   3    2    33218.6      2.4     2.71
 2157      3     33273.5    -10   2    2    33237.8     35.7     2.37
 2158      3     33267.7      0   2    2    33269.4     -1.7     1.96
 2159      3     33313.5      0   3    1    33312.9      0.6     1.29
 2160     -2    (33374.2)    -2   3    1    33369.0     (5.2)    1.28
 2161      3     33434.8      0
           3     33434.8      0
           3     33510.5      0
           3     33592.5      0






STEP 2 Check for potential outliers. Sequential differences (Ref. 2) were 
calculated for the sample (these calculations are omitted here). Values of 
the fourth order differences (D4) which were greater than 50 in magnitude were 
considered to indicate potential outliers. When adjacent values of D4 also 
exceed 50, the observation with the largest D4 is considered to be an outlier. 
These are labeled by the value -10 in column 4 of Table 1. Temporary values 
that are also outliers are identified in column 4 by changing the values to -11 
and -12 for unscheduled and scheduled missing values, respectively. 

The procedure for identifying outliers is illustrated by examining the 
values of D4 at times t=2135 where D4=-88.2, at t=2136 where D4=108.7, and 
at t=2137 where D4=-92.5. The observation at t=2136 is considered to be an 
outlier. Since it is at a scheduled missing point, the appropriate value in 
column 4 is -12. 

Outliers were also located at t=2144 and t=2157. These three 
outliers are indicated by boxes in Figure 1. 
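
The outlier check of Step 2 can be sketched in Python as follows. The function names are hypothetical; each fourth order difference is attributed to the middle of the five observations from which it is formed, and within one short segment only the observation with the largest difference above the threshold is reported, following the rule stated above. The first segment uses the xo values of Table 2 (t=2133 to t=2139) and the second uses the Table 1 values for t=2154 to t=2160.

```python
def differences(values, order):
    """Sequential differences of the given order."""
    d = list(values)
    for _ in range(order):
        d = [b - a for a, b in zip(d, d[1:])]
    return d

def find_outlier(times, values, threshold=50.0):
    """Time of the largest |D4| above the threshold, or None."""
    d4 = differences(values, 4)
    # d4[i] is built from values[i..i+4] and is centered on times[i + 2].
    candidates = [(abs(v), times[i + 2]) for i, v in enumerate(d4)
                  if abs(v) > threshold]
    return max(candidates)[1] if candidates else None

t_a = [2133, 2134, 2135, 2136, 2137, 2138, 2139]
x_a = [33637.7, 33556.5, 33486.6, 33466.5, 33446.3, 33485.1, 33559.3]
t_b = [2154, 2155, 2156, 2157, 2158, 2159, 2160]
x_b = [33219.5, 33212.5, 33221.0, 33273.5, 33267.7, 33313.5, 33374.2]

print([round(v, 1) for v in differences(x_a, 4)], find_outlier(t_a, x_a))
print(find_outlier(t_b, x_b))
```

For the first segment the fourth order differences agree with those listed in Table 2, and the two segments flag t=2136 and t=2157 respectively.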

STEP 3 Treatment of Outliers. Outliers are treated by iterations of the 7-point 
L-S smoothing method so that the resulting estimate is not contaminated by the 
outlier value. Iterations were continued until the residual errors (R) at the 
times of the outlier and any missing points in the data segment were less than 
unity. 

(The results presented here were calculated on a TI-59 calculator and the 
polynomial order was selected on the basis of the SDRK's. Results may be 
different when the FM's are used to select the appropriate polynomial order.) 

TABLE 2 

Sequential Differences t = 2136 

A. Before Smoothing 

    t         xo        D1        D2        D3        D4

 2133    33637.7
                     -81.2
 2134    33556.5                11.3
                     -69.9                38.5
 2135    33486.6                49.8               -88.4
                     -20.1               -49.9
 2136    33466.5                -0.1               109.0
                     -20.2                59.1
 2137    33446.3                59.0               -82.7
                      38.8               -23.6
 2138    33485.1                35.4
                      74.2
 2139    33559.3

B. After Smoothing 

    t         xo        D1        D2        D3        D4

 2133    33637.7
                     -81.2
 2134    33556.5                11.3
                     -69.9                24.2
 2135    33486.6                35.5               -17.2
                     -34.4                 7.0
 2136    33452.2                28.5                 7.2
                      -5.9                14.2
 2137    33446.3                44.7               -23.5
                      38.8                -9.3
 2138    33485.1                35.4
                      74.2
 2139    33559.3

TABLE 3 

Sequential Differences t = 2157 

A. Before Smoothing 

    t         xo        D1        D2        D3        D4

 2154    33219.5
                      -7.0
 2155    33212.5                15.5
                       8.5                28.5
 2156    33221.0                44.0              -130.8
                      52.5              -102.3
 2157    33273.5               -58.3               212.2
                      -5.8               109.9
 2158    33267.7                51.6              -146.7
                      45.8               -36.8
 2159    33313.5                14.8
                      60.7
 2160    33374.2

B. After Smoothing 

    t         xo        D1        D2        D3        D4

 2154    33219.5
                      -7.0
 2155    33212.5                15.5
                       8.5                -6.4
 2156    33221.0                 9.1                 8.4
                      17.6                 2.4
 2157    33238.6                11.5                 2.8
                      29.1                 5.2
 2158    33267.7                16.7               -13.1
                      45.8                -7.9
 2159    33313.5                 8.8
                      54.6
 2160    33368.1

The outlier at t=2136 required 4 iterations. In each iteration a cubic 
polynomial was used so that DF=DF3=2. These values are shown in columns 5 and 6 
of Table 1. The value FM=FM3=3.94 for this point indicates a fair amount of 
uncertainty. (Values of less than 2 occur when the position location arrays are 
performing well and the polynomial is a good representation of the actual 
vehicular path.) Nevertheless, the value of FM indicates that the estimate xe 
can be expected to be within 4 units of the true value of xt about 95% of the 
time. The value of FM is given in column 8 of Table 1. 

It is of some interest to note that the residual error (R) between the 
original temporary value xo=33466.5 and the estimate xe=33452.2 is R=14.3, which 
is more than three times as large as the value of FM and hence the indication of 
an outlier at this time was correct. The value of the residual error is given 
in column 7. 
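
The iteration used for the outlier at t=2136 can be sketched in Python under some stated assumptions. For seven equally spaced points, the least-squares quadratic or cubic evaluated at the center of the segment is the weighted sum with weights (-2, 3, 6, 7, 6, 3, -2)/21; the sketch uses these standard weights instead of a general fitting routine. The function names are hypothetical, and the loop (replace the center value by its smoothed value until the change is less than unity) is one reading of the procedure described in Step 3, not a transcription of the TI-59 program. The segment is the "Before Smoothing" column of Table 2.

```python
# Center weights of the 7-point least-squares quadratic/cubic fit (sum = 21).
WEIGHTS = (-2, 3, 6, 7, 6, 3, -2)

def smooth_center(segment):
    """Fitted value at the center of a 7-point segment."""
    return sum(w * v for w, v in zip(WEIGHTS, segment)) / 21.0

def iterate_center(segment, tol=1.0, max_iter=50):
    """Re-smooth the center point until the residual is below tol.

    Returns (smoothed center value, number of iterations).
    """
    seg = list(segment)
    for n in range(1, max_iter + 1):
        new = smooth_center(seg)
        residual = seg[3] - new      # presmoothed minus smoothed value
        seg[3] = new
        if abs(residual) < tol:
            return new, n
    raise RuntimeError("iteration did not converge")

# xo for t = 2133 .. 2139, with the temporary value 33466.5 at t = 2136.
segment = [33637.7, 33556.5, 33486.6, 33466.5, 33446.3, 33485.1, 33559.3]
value, iterations = iterate_center(segment)
print(round(value, 1), iterations)
```

Under these assumptions the sketch stops after four passes at a value that rounds to 33452.2, in line with the number of iterations and the estimate xe quoted above.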

Calculations of sequential differences before and after smoothing are 
presented in Table 2. The potential outliers at t=2135, 2136, and 2137 are no 
longer present when the temporary value at t=2136 is replaced by its smoothed 
value. 

Treatment of the outlier at t=2144 was similar. Again, four iterations 
were required. The value of FM (2.41) is quite low and the residual error 
R=14.3 indicates that the temporary value at this time was inconsistent with its 
neighboring values. Sequential differences were not recalculated. 

The outlier at t=2157 provides a slight variation since the end point of 
the segment (t=2160) is also a missing point. If, in the smoothing process for 
treating the outlier at t=2157, this temporary value were treated as an observed 
value, the L-S smoothing program would give a weight for this temporary value 
equal to that of the legitimate observed value. In effect, this would give a 
weight of 1.5 for the observation at t=2159 and a weight of 0.5 for the 
observation at t=2161. In order to avoid this unbalanced weighting, the 
temporary value at t=2160 was replaced by its smoothed value at each iteration. 
The smoothing was then continued until the residual errors at both times t=2157 
and t=2160 were less than unity. The purpose of this is to ensure that only the 
NS=5 legitimate observations are retained in the smoothing process. The value 
of FM (2.37) is again quite low. The error R=35.7 supports the decision to 
consider the observation at t=2157 an outlier. This is also supported by the 
change in the sequential differences shown in Table 3. Note that for this point 
NS=7-2=5 and that with K=2 the value of DF is 2. 

STEP 4 Treatment of missing points. The treatment of missing points is the same 
as that for outliers. As with outliers, iterations were continued until the 
residual errors were less than unity. The results are shown in Table 4 and 
Table 1. 

STEP 5 Treatment of remaining points. The other points in the sample were 
smoothed without iterations since they were considered to be legitimate 
observations. In each treatment the smoothed values of the missing and outlier 
points were used. The values used for NS and the DF's were reduced to represent 
the number of legitimate points in each data segment. At several times both the 
residuals and the FM's were quite large (e.g., at t=2131, R=7.6 and FM=11.7). 
Reference to Figure 1 suggests that the vehicle was changing course quite 
rapidly and that these large values may be due to inadequacy of the model 
(polynomial) rather than to increased scatter. In either case there is a 
degradation of quality at this point and this degradation is reflected in the 
large value for FM. 

IV CONCLUSIONS AND RECOMMENDATIONS 



The proposed Figure of Merit provides a numerical expression for the 
quality of NUWES data. It includes the effects of missing points and outliers on 
the smoothing process. It also includes the effects of the use of a polynomial 
of inadequate order in the smoothing process. In essence, the proposed FM 
provides a numerical bound for the differences that can be expected between the 
estimated values and the actual values of the vehicle's coordinates at the 
sequence of observational times. 

It is suggested that the FM's can be used to represent the quality of data 
for specific NUWES trials and for segments of those trials. They can also be 
used to indicate the capabilities of specific position location arrays. This 
latter application could include evaluation of data collected by specific arrays 
over several trials and thus could be used as an indicator of degradation of 
array capabilities. Further, on sorting of FM's by distance of vehicles from an 
array, they could be used to establish relationships between these factors. 



As previously indicated, the results presented in Table 1 were obtained 
using a TI-59 calculator with severely limited program capabilities and hence 
involving considerable operator interaction. The translation and extension of 
the process to BASIC language for use on an IBM PC has been initiated. It is 
recommended that this effort be continued. 